Dataset statistics
| Number of variables | 14 |
|---|---|
| Number of observations | 244736 |
| Missing cells | 40732 |
| Missing cells (%) | 1.2% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 26.1 MiB |
| Average record size in memory | 112.0 B |
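The overview figures can be cross-checked against each other with a little arithmetic. The sketch below uses only numbers copied from the table above; the 8 bytes per value is an assumption (typical for 64-bit numeric or object columns), not something the report states.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 14
n_observations = 244736
missing_cells = 40732

# Missing cells as a share of all cells in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# The same count as a share of rows, if every missing cell sits in one column.
missing_pct_single_column = 100 * missing_cells / n_observations

# Memory estimate, ASSUMING 8 bytes per stored value.
bytes_per_record = n_variables * 8
total_mib = bytes_per_record * n_observations / 2**20
column_mib = 8 * n_observations / 2**20

print(f"{missing_pct:.1f}% of all cells missing")
print(f"{missing_pct_single_column:.1f}% of rows missing in a single column")
print(f"{bytes_per_record} B per record, {total_mib:.1f} MiB total, {column_mib:.1f} MiB per column")
```

Under that assumption the results agree with the reported 1.2% missing cells, 112.0 B per record, 26.1 MiB in total and 1.9 MiB per variable.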
Variable types
| NUM | 9 |
|---|---|
| CAT | 3 |
| BOOL | 1 |
| DATE | 1 |
Reproduction
| Analysis started | 2020-07-12 21:52:17.743889 |
|---|---|
| Analysis finished | 2020-07-12 21:52:48.234682 |
| Duration | 30.49 seconds |
| Software version | pandas-profiling v2.9.0rc1 |
| Download configuration | config.yaml |
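The reported duration follows directly from the two timestamps. A minimal check, assuming both timestamps are in the same time zone:

```python
from datetime import datetime

# Timestamp layout used in the "Reproduction" table.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2020-07-12 21:52:17.743889", FMT)
finished = datetime.strptime("2020-07-12 21:52:48.234682", FMT)

duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```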
Warnings
VERSIE has constant value "1" | Constant |
DATUM_BESTAND has constant value "2020-06-16" | Constant |
PEILDATUM has constant value "2020-06-01" | Constant |
TYPERENDE_DIAGNOSE_CD has a high cardinality: 1766 distinct values | High cardinality |
AANTAL_SUBTRAJECT_PER_ZPD is highly correlated with AANTAL_PAT_PER_ZPD | High correlation |
AANTAL_PAT_PER_ZPD is highly correlated with AANTAL_SUBTRAJECT_PER_ZPD | High correlation |
AANTAL_SUBTRAJECT_PER_DIAG is highly correlated with AANTAL_PAT_PER_DIAG | High correlation |
AANTAL_PAT_PER_DIAG is highly correlated with AANTAL_SUBTRAJECT_PER_DIAG | High correlation |
AANTAL_SUBTRAJECT_PER_SPC is highly correlated with AANTAL_PAT_PER_SPC | High correlation |
AANTAL_PAT_PER_SPC is highly correlated with AANTAL_SUBTRAJECT_PER_SPC | High correlation |
GEMIDDELDE_VERKOOPPRIJS has 40732 (16.6%) missing values | Missing |
AANTAL_SUBTRAJECT_PER_ZPD is highly skewed (γ1 = 21.1168) | Skewed |
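The "Skewed" warning reports γ1, the moment coefficient of skewness. As a rough sketch (not necessarily the profiler's exact estimator, which may apply a sample-size correction), the population form divides the third central moment by the second central moment raised to the power 1.5. The function name and the sample values are made up for illustration.

```python
def skewness(values):
    """Population moment coefficient of skewness: m3 / m2**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5

# Hypothetical samples: one symmetric, one with a long right tail.
symmetric = [1, 2, 3, 4, 5]
right_tailed = [1, 1, 1, 1, 100]

print(skewness(symmetric))     # zero for a symmetric sample
print(skewness(right_tailed))  # positive for a right tail
```

A count variable in which most rows are small and a few are very large, as in the warning above, produces a large positive γ1 in the same way.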
VERSIE
Boolean
| Distinct count | 1 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.9 MiB |
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 1 | 244736 | 100.0% |
DATUM_BESTAND
Categorical
| Distinct count | 1 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.9 MiB |
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 2020-06-16 | 244736 | 100.0% |
Length
| Max length | 10 |
|---|---|
| Median length | 10 |
| Mean length | 10 |
| Min length | 10 |
PEILDATUM
Categorical
| Distinct count | 1 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.9 MiB |
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 2020-06-01 | 244736 | 100.0% |
Length
| Max length | 10 |
|---|---|
| Median length | 10 |
| Mean length | 10 |
| Min length | 10 |
JAAR
Date
| Distinct count | 9 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.9 MiB |
| Minimum | 2012-01-01 00:00:00 |
|---|---|
| Maximum | 2020-01-01 00:00:00 |
Histogram with fixed size bins (bins=9)
BEHANDELEND_SPECIALISME_CD
Real number (ℝ≥0)
| Distinct count | 27 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 421.267 |
|---|---|
| Minimum | 301 |
| Maximum | 8418 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 1.9 MiB |
Quantile statistics
| Minimum | 301 |
|---|---|
| 5-th percentile | 302 |
| Q1 | 305 |
| median | 313 |
| Q3 | 322 |
| 95-th percentile | 335 |
| Maximum | 8418 |
| Range | 8117 |
| Interquartile range (IQR) | 17 |
Descriptive statistics
| Standard deviation | 918.833 |
|---|---|
| Coefficient of variation (CV) | 2.18112 |
| Kurtosis | 71.601 |
| Mean | 421.267 |
| Median Absolute Deviation (MAD) | 8 |
| Skewness | 8.5722 |
| Sum | 1.03099e+08 |
| Variance | 844255 |
| Monotonicity | Not monotonic |
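Several entries in the quantile and descriptive tables are derived from others, which makes them easy to sanity-check. The sketch below recomputes them for BEHANDELEND_SPECIALISME_CD from the rounded figures shown above, so small differences in the last digit are expected.

```python
# Figures copied from the tables above (already rounded by the report).
n_observations = 244736
mean = 421.267
std = 918.833
q1, q3 = 305, 322
minimum, maximum = 301, 8418

value_range = maximum - minimum  # "Range"
iqr = q3 - q1                    # "Interquartile range (IQR)"
cv = std / mean                  # "Coefficient of variation (CV)"
variance = std ** 2              # "Variance", up to rounding of std
total = mean * n_observations    # "Sum", up to rounding of mean

print(value_range, iqr)
print(f"{cv:.5f}")
print(f"{variance:.0f}")
print(f"{total:.5e}")
```

The gap between the mean (about 421) and the median (313) comes from a handful of large codes such as 8418, which is also why the standard deviation exceeds the mean even though the interquartile range is only 17.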
Histogram with fixed size bins (bins=27)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 305 | 34655 | 14.2% | |
| 313 | 31766 | 13.0% | |
| 303 | 28291 | 11.6% | |
| 330 | 19677 | 8.0% | |
| 316 | 16667 | 6.8% | |
| 308 | 12302 | 5.0% | |
| 306 | 10125 | 4.1% | |
| 324 | 10078 | 4.1% | |
| 301 | 10029 | 4.1% | |
| 304 | 7973 | 3.3% | |
| Other values (17) | 63173 | 25.8% |
Minimum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 301 | 10029 | 4.1% | |
| 302 | 5317 | 2.2% | |
| 303 | 28291 | 11.6% | |
| 304 | 7973 | 3.3% | |
| 305 | 34655 | 14.2% |
Maximum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 8418 | 3182 | 1.3% | |
| 1900 | 161 | 0.1% | |
| 390 | 576 | 0.2% | |
| 389 | 2678 | 1.1% | |
| 362 | 3820 | 1.6% |
TYPERENDE_DIAGNOSE_CD
Categorical
| Distinct count | 1766 |
|---|---|
| Unique (%) | 0.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.9 MiB |
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 101 | 1031 | 0.4% | |
| 402 | 1002 | 0.4% | |
| 403 | 983 | 0.4% | |
| 301 | 978 | 0.4% | |
| 201 | 926 | 0.4% | |
| 203 | 923 | 0.4% | |
| 401 | 827 | 0.3% | |
| 404 | 814 | 0.3% | |
| 409 | 805 | 0.3% | |
| 802 | 800 | 0.3% | |
| Other values (1756) | 235647 | 96.3% |
Length
| Max length | 4 |
|---|---|
| Median length | 3 |
| Mean length | 3.34829 |
| Min length | 2 |
ZORGPRODUCT_CD
Real number (ℝ≥0)
| Distinct count | 5885 |
|---|---|
| Unique (%) | 2.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4.38386e+08 |
|---|---|
| Minimum | 1.0501e+07 |
| Maximum | 9.98418e+08 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 1.9 MiB |
Quantile statistics
| Minimum | 1.0501e+07 |
|---|---|
| 5-th percentile | 2.8999e+07 |
| Q1 | 9.9799e+07 |
| median | 1.49599e+08 |
| Q3 | 9.90004e+08 |
| 95-th percentile | 9.90416e+08 |
| Maximum | 9.98418e+08 |
| Range | 9.87917e+08 |
| Interquartile range (IQR) | 8.90205e+08 |
Descriptive statistics
| Standard deviation | 4.28488e+08 |
|---|---|
| Coefficient of variation (CV) | 0.97742 |
| Kurtosis | -1.72574 |
| Mean | 4.38386e+08 |
| Median Absolute Deviation (MAD) | 1.196e+08 |
| Skewness | 0.47947 |
| Sum | 1.07289e+14 |
| Variance | 1.83602e+17 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 9.90004e+08 | 1801 | 0.7% | |
| 9.90004e+08 | 1754 | 0.7% | |
| 9.90003e+08 | 1734 | 0.7% | |
| 9.90004e+08 | 1377 | 0.6% | |
| 9.90356e+08 | 1221 | 0.5% | |
| 9.90003e+08 | 1127 | 0.5% | |
| 9.90356e+08 | 1125 | 0.5% | |
| 1.31999e+08 | 1111 | 0.5% | |
| 1.31999e+08 | 1088 | 0.4% | |
| 1.99299e+08 | 1034 | 0.4% | |
| Other values (5875) | 231364 | 94.5% |
Minimum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 1.0501e+07 | 6 | < 0.1% | |
| 1.0501e+07 | 9 | < 0.1% | |
| 1.0501e+07 | 9 | < 0.1% | |
| 1.0501e+07 | 9 | < 0.1% | |
| 1.0501e+07 | 3 | < 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 9.98418e+08 | 112 | < 0.1% | |
| 9.98418e+08 | 98 | < 0.1% | |
| 9.98418e+08 | 27 | < 0.1% | |
| 9.98418e+08 | 6 | < 0.1% | |
| 9.98418e+08 | 5 | < 0.1% |
AANTAL_PAT_PER_ZPD
Real number (ℝ≥0)
| Distinct count | 8593 |
|---|---|
| Unique (%) | 3.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 495.397 |
|---|---|
| Minimum | 1 |
| Maximum | 152924 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 1.9 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 2 |
| median | 13 |
| Q3 | 96 |
| 95-th percentile | 1658 |
| Maximum | 152924 |
| Range | 152923 |
| Interquartile range (IQR) | 94 |
Descriptive statistics
| Standard deviation | 3088.28 |
|---|---|
| Coefficient of variation (CV) | 6.23395 |
| Kurtosis | 386.354 |
| Mean | 495.397 |
| Median Absolute Deviation (MAD) | 12 |
| Skewness | 16.4588 |
| Sum | 1.21242e+08 |
| Variance | 9.53748e+06 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 1 | 41601 | 17.0% | |
| 2 | 20219 | 8.3% | |
| 3 | 13096 | 5.4% | |
| 4 | 9689 | 4.0% | |
| 5 | 7535 | 3.1% | |
| 6 | 6265 | 2.6% | |
| 7 | 5193 | 2.1% | |
| 8 | 4380 | 1.8% | |
| 9 | 4058 | 1.7% | |
| 10 | 3534 | 1.4% | |
| Other values (8583) | 129166 | 52.8% |
Minimum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 1 | 41601 | 17.0% | |
| 2 | 20219 | 8.3% | |
| 3 | 13096 | 5.4% | |
| 4 | 9689 | 4.0% | |
| 5 | 7535 | 3.1% |
Maximum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 152924 | 1 | < 0.1% | |
| 151865 | 1 | < 0.1% | |
| 144569 | 1 | < 0.1% | |
| 127253 | 1 | < 0.1% | |
| 111784 | 1 | < 0.1% |
AANTAL_SUBTRAJECT_PER_ZPD
Real number (ℝ≥0)
| Distinct count | 9136 |
|---|---|
| Unique (%) | 3.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 577.274 |
|---|---|
| Minimum | 1 |
| Maximum | 239907 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 1.9 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 3 |
| median | 14 |
| Q3 | 105 |
| 95-th percentile | 1870 |
| Maximum | 239907 |
| Range | 239906 |
| Interquartile range (IQR) | 102 |
Descriptive statistics
| Standard deviation | 3890.33 |
|---|---|
| Coefficient of variation (CV) | 6.73913 |
| Kurtosis | 716.993 |
| Mean | 577.274 |
| Median Absolute Deviation (MAD) | 13 |
| Skewness | 21.1168 |
| Sum | 1.4128e+08 |
| Variance | 1.51346e+07 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 1 | 40161 | 16.4% | |
| 2 | 19873 | 8.1% | |
| 3 | 12990 | 5.3% | |
| 4 | 9529 | 3.9% | |
| 5 | 7468 | 3.1% | |
| 6 | 6282 | 2.6% | |
| 7 | 5184 | 2.1% | |
| 8 | 4327 | 1.8% | |
| 9 | 3984 | 1.6% | |
| 10 | 3556 | 1.5% | |
| Other values (9126) | 131382 | 53.7% |
Minimum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 1 | 40161 | 16.4% | |
| 2 | 19873 | 8.1% | |
| 3 | 12990 | 5.3% | |
| 4 | 9529 | 3.9% | |
| 5 | 7468 | 3.1% |
Maximum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 239907 | 1 | < 0.1% | |
| 232508 | 1 | < 0.1% | |
| 231004 | 1 | < 0.1% | |
| 227757 | 1 | < 0.1% | |
| 219452 | 1 | < 0.1% |
AANTAL_PAT_PER_DIAG
Real number (ℝ≥0)
| Distinct count | 7435 |
|---|---|
| Unique (%) | 3.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 7503.59 |
|---|---|
| Minimum | 1 |
| Maximum | 208905 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 1.9 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 33 |
| Q1 | 365 |
| median | 1620 |
| Q3 | 6152 |
| 95-th percentile | 36256 |
| Maximum | 208905 |
| Range | 208904 |
| Interquartile range (IQR) | 5787 |
Descriptive statistics
| Standard deviation | 17523.2 |
|---|---|
| Coefficient of variation (CV) | 2.33531 |
| Kurtosis | 32.4077 |
| Mean | 7503.59 |
| Median Absolute Deviation (MAD) | 1492 |
| Skewness | 4.98213 |
| Sum | 1.8364e+09 |
| Variance | 3.07063e+08 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 4 | 466 | 0.2% | |
| 14 | 456 | 0.2% | |
| 17 | 454 | 0.2% | |
| 8 | 451 | 0.2% | |
| 20 | 434 | 0.2% | |
| 21 | 431 | 0.2% | |
| 12 | 425 | 0.2% | |
| 3 | 424 | 0.2% | |
| 26 | 422 | 0.2% | |
| 9 | 421 | 0.2% | |
| Other values (7425) | 240352 | 98.2% |
Minimum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 1 | 375 | 0.2% | |
| 2 | 411 | 0.2% | |
| 3 | 424 | 0.2% | |
| 4 | 466 | 0.2% | |
| 5 | 408 | 0.2% |
Maximum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 208905 | 19 | < 0.1% | |
| 208540 | 25 | < 0.1% | |
| 204417 | 17 | < 0.1% | |
| 202574 | 17 | < 0.1% | |
| 200177 | 16 | < 0.1% |
AANTAL_SUBTRAJECT_PER_DIAG
Real number (ℝ≥0)
| Distinct count | 8202 |
|---|---|
| Unique (%) | 3.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 10523.4 |
|---|---|
| Minimum | 1 |
| Maximum | 338389 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 1.9 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 40 |
| Q1 | 470 |
| median | 2197 |
| Q3 | 8555 |
| 95-th percentile | 50745 |
| Maximum | 338389 |
| Range | 338388 |
| Interquartile range (IQR) | 8085 |
Descriptive statistics
| Standard deviation | 25320.4 |
|---|---|
| Coefficient of variation (CV) | 2.40611 |
| Kurtosis | 36.5898 |
| Mean | 10523.4 |
| Median Absolute Deviation (MAD) | 2040 |
| Skewness | 5.25404 |
| Sum | 2.57545e+09 |
| Variance | 6.41122e+08 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 3 | 416 | 0.2% | |
| 11 | 395 | 0.2% | |
| 4 | 390 | 0.2% | |
| 13 | 381 | 0.2% | |
| 17 | 374 | 0.2% | |
| 5 | 371 | 0.2% | |
| 10 | 357 | 0.1% | |
| 31 | 348 | 0.1% | |
| 6 | 346 | 0.1% | |
| 2 | 343 | 0.1% | |
| Other values (8192) | 241015 | 98.5% |
Minimum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 1 | 332 | 0.1% | |
| 2 | 343 | 0.1% | |
| 3 | 416 | 0.2% | |
| 4 | 390 | 0.2% | |
| 5 | 371 | 0.2% |
Maximum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 338389 | 25 | < 0.1% | |
| 338076 | 19 | < 0.1% | |
| 323541 | 20 | < 0.1% | |
| 299417 | 17 | < 0.1% | |
| 293998 | 17 | < 0.1% |
AANTAL_PAT_PER_SPC
Real number (ℝ≥0)
| Distinct count | 241 |
|---|---|
| Unique (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 657361 |
|---|---|
| Minimum | 39 |
| Maximum | 1.48953e+06 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 1.9 MiB |
Quantile statistics
| Minimum | 39 |
|---|---|
| 5-th percentile | 27469 |
| Q1 | 248792 |
| median | 744739 |
| Q3 | 995547 |
| 95-th percentile | 1.33749e+06 |
| Maximum | 1.48953e+06 |
| Range | 1.48949e+06 |
| Interquartile range (IQR) | 746755 |
Descriptive statistics
| Standard deviation | 424374 |
|---|---|
| Coefficient of variation (CV) | 0.645573 |
| Kurtosis | -1.11151 |
| Mean | 657361 |
| Median Absolute Deviation (MAD) | 315360 |
| Skewness | 0.0157318 |
| Sum | 1.6088e+11 |
| Variance | 1.80093e+11 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 880987 | 5102 | 2.1% | |
| 874258 | 4355 | 1.8% | |
| 843765 | 4348 | 1.8% | |
| 887818 | 4326 | 1.8% | |
| 860109 | 4254 | 1.7% | |
| 697863 | 3960 | 1.6% | |
| 1.08014e+06 | 3892 | 1.6% | |
| 1.06621e+06 | 3851 | 1.6% | |
| 1.0601e+06 | 3841 | 1.6% | |
| 1.04034e+06 | 3810 | 1.6% | |
| Other values (231) | 202997 | 82.9% |
Minimum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 39 | 4 | < 0.1% | |
| 95 | 8 | < 0.1% | |
| 141 | 38 | < 0.1% | |
| 396 | 44 | < 0.1% | |
| 695 | 119 | < 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 1.48953e+06 | 2976 | 1.2% | |
| 1.45066e+06 | 3054 | 1.2% | |
| 1.4219e+06 | 3564 | 1.5% | |
| 1.33749e+06 | 3540 | 1.4% | |
| 1.33322e+06 | 3547 | 1.4% |
AANTAL_SUBTRAJECT_PER_SPC
Real number (ℝ≥0)
| Distinct count | 241 |
|---|---|
| Unique (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.03549e+06 |
|---|---|
| Minimum | 39 |
| Maximum | 2.54922e+06 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 1.9 MiB |
Quantile statistics
| Minimum | 39 |
|---|---|
| 5-th percentile | 33778 |
| Q1 | 364439 |
| median | 990055 |
| Q3 | 1.72779e+06 |
| 95-th percentile | 2.18701e+06 |
| Maximum | 2.54922e+06 |
| Range | 2.54919e+06 |
| Interquartile range (IQR) | 1.36335e+06 |
Descriptive statistics
| Standard deviation | 721901 |
|---|---|
| Coefficient of variation (CV) | 0.697156 |
| Kurtosis | -0.946127 |
| Mean | 1.03549e+06 |
| Median Absolute Deviation (MAD) | 652171 |
| Skewness | 0.274463 |
| Sum | 2.53423e+11 |
| Variance | 5.21141e+11 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 1.2118e+06 | 5102 | 2.1% | |
| 1.28143e+06 | 4355 | 1.8% | |
| 1.21594e+06 | 4348 | 1.8% | |
| 1.30346e+06 | 4326 | 1.8% | |
| 1.26438e+06 | 4254 | 1.7% | |
| 983937 | 3960 | 1.6% | |
| 2.54922e+06 | 3892 | 1.6% | |
| 2.49615e+06 | 3851 | 1.6% | |
| 2.54122e+06 | 3841 | 1.6% | |
| 2.06818e+06 | 3810 | 1.6% | |
| Other values (231) | 202997 | 82.9% |
Minimum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 39 | 4 | < 0.1% | |
| 95 | 8 | < 0.1% | |
| 142 | 38 | < 0.1% | |
| 397 | 44 | < 0.1% | |
| 696 | 119 | < 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 2.54922e+06 | 3892 | 1.6% | |
| 2.54122e+06 | 3841 | 1.6% | |
| 2.49615e+06 | 3851 | 1.6% | |
| 2.18701e+06 | 3757 | 1.5% | |
| 2.06818e+06 | 3810 | 1.6% |
GEMIDDELDE_VERKOOPPRIJS
Real number (ℝ≥0)
| Distinct count | 3046 |
|---|---|
| Unique (%) | 1.5% |
| Missing | 40732 |
| Missing (%) | 16.6% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3480.45 |
|---|---|
| Minimum | 70 |
| Maximum | 287220 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 1.9 MiB |
Quantile statistics
| Minimum | 70 |
|---|---|
| 5-th percentile | 140 |
| Q1 | 455 |
| median | 1220 |
| Q3 | 3965 |
| 95-th percentile | 13130 |
| Maximum | 287220 |
| Range | 287150 |
| Interquartile range (IQR) | 3510 |
Descriptive statistics
| Standard deviation | 6624.36 |
|---|---|
| Coefficient of variation (CV) | 1.9033 |
| Kurtosis | 178.214 |
| Mean | 3480.45 |
| Median Absolute Deviation (MAD) | 995 |
| Skewness | 8.12602 |
| Sum | 7.10027e+08 |
| Variance | 4.38822e+07 |
| Monotonicity | Not monotonic |
Histogram with fixed size bins (bins=50)
Common values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 160 | 1755 | 0.7% | |
| 105 | 1637 | 0.7% | |
| 110 | 1591 | 0.7% | |
| 180 | 1347 | 0.6% | |
| 300 | 1204 | 0.5% | |
| 140 | 1202 | 0.5% | |
| 120 | 1159 | 0.5% | |
| 145 | 1146 | 0.5% | |
| 165 | 1123 | 0.5% | |
| 500 | 1068 | 0.4% | |
| Other values (3036) | 190772 | 78.0% | |
| (Missing) | 40732 | 16.6% |
Minimum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 70 | 226 | 0.1% | |
| 75 | 74 | < 0.1% | |
| 80 | 360 | 0.1% | |
| 85 | 869 | 0.4% | |
| 90 | 500 | 0.2% |
Maximum 5 values
| Value | Count | Frequency (%) | |
|---|---|---|---|
| 287220 | 8 | < 0.1% | |
| 148910 | 3 | < 0.1% | |
| 142880 | 4 | < 0.1% | |
| 122155 | 4 | < 0.1% | |
| 116765 | 3 | < 0.1% |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, implying that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
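The calculation described above can be sketched in a few lines of plain Python. This is an illustration only, not the profiler's implementation; the function name and the sample data are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)

# Hypothetical sample data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = pearson_r(x, y)

# Shifting and rescaling each variable (with positive factors) leaves r unchanged.
r_transformed = pearson_r([10 * a + 3 for a in x], [0.5 * b - 7 for b in y])

print(r, r_transformed)
```

The 1/n factors cancel between numerator and denominator, so using n - 1 instead gives the same r.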
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
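A small sketch of that recipe: rank both variables, then apply the Pearson formula to the ranks. Tied values receive the average of their positions, which is a common convention and an assumption here; the function names and data are hypothetical.

```python
from math import sqrt

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))

def spearman_rho(x, y):
    """Pearson correlation of the rank variables."""
    return pearson_r(average_ranks(x), average_ranks(y))

# A monotonic but nonlinear relationship (hypothetical data).
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]

rho = spearman_rho(x, y)
r = pearson_r(x, y)
print(rho, r)
```

Because y increases whenever x does, the ranks coincide and ρ reaches its maximum, while Pearson's r on the raw values stays below 1.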
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
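The pair-counting definition translates directly into code. The sketch below implements exactly the formula stated above (often called τ-a); statistical libraries frequently default to a tie-adjusted variant, so results can differ when the data contain ties. The function name and data are hypothetical.

```python
from itertools import combinations

def kendall_tau(x, y):
    """(concordant - discordant) / total number of pairs."""
    concordant = 0
    discordant = 0
    pairs = list(combinations(range(len(x)), 2))
    for i, j in pairs:
        # The sign tells whether x and y order this pair the same way.
        s = (x[i] - x[j]) * (y[i] - y[j])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
        # s == 0 means a tie in x or y; it counts as neither.
    return (concordant - discordant) / len(pairs)

# Hypothetical data: one swapped pair out of six.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]

tau = kendall_tau(x, y)
print(tau)
```

With four observations there are six pairs; five are concordant and one is discordant, so τ = (5 - 1) / 6.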
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available.
First rows
| | VERSIE | DATUM_BESTAND | PEILDATUM | JAAR | BEHANDELEND_SPECIALISME_CD | TYPERENDE_DIAGNOSE_CD | ZORGPRODUCT_CD | AANTAL_PAT_PER_ZPD | AANTAL_SUBTRAJECT_PER_ZPD | AANTAL_PAT_PER_DIAG | AANTAL_SUBTRAJECT_PER_DIAG | AANTAL_PAT_PER_SPC | AANTAL_SUBTRAJECT_PER_SPC | GEMIDDELDE_VERKOOPPRIJS |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1.0 | 2020-06-16 | 2020-06-01 | 2015-01-01 | 324 | 306 | 131999119 | 1 | 1 | 868 | 1374 | 248792 | 378680 | 1460.0 |
| 1 | 1.0 | 2020-06-16 | 2020-06-01 | 2015-01-01 | 324 | 306 | 131999186 | 8 | 8 | 868 | 1374 | 248792 | 378680 | 495.0 |
| 2 | 1.0 | 2020-06-16 | 2020-06-01 | 2015-01-01 | 324 | 306 | 131999040 | 19 | 37 | 868 | 1374 | 248792 | 378680 | 1150.0 |
| 3 | 1.0 | 2020-06-16 | 2020-06-01 | 2015-01-01 | 324 | 306 | 131999206 | 750 | 1057 | 868 | 1374 | 248792 | 378680 | 245.0 |
| 4 | 1.0 | 2020-06-16 | 2020-06-01 | 2015-01-01 | 324 | 306 | 131999117 | 12 | 12 | 868 | 1374 | 248792 | 378680 | 835.0 |
| 5 | 1.0 | 2020-06-16 | 2020-06-01 | 2015-01-01 | 324 | 306 | 131999187 | 1 | 1 | 868 | 1374 | 248792 | 378680 | 530.0 |
| 6 | 1.0 | 2020-06-16 | 2020-06-01 | 2015-01-01 | 324 | 306 | 131999154 | 15 | 15 | 868 | 1374 | 248792 | 378680 | 850.0 |
| 7 | 1.0 | 2020-06-16 | 2020-06-01 | 2015-01-01 | 324 | 306 | 131999155 | 9 | 9 | 868 | 1374 | 248792 | 378680 | 1170.0 |
| 8 | 1.0 | 2020-06-16 | 2020-06-01 | 2015-01-01 | 324 | 306 | 131999020 | 5 | 6 | 868 | 1374 | 248792 | 378680 | 2335.0 |
| 9 | 1.0 | 2020-06-16 | 2020-06-01 | 2015-01-01 | 324 | 306 | 131999022 | 11 | 12 | 868 | 1374 | 248792 | 378680 | 6700.0 |
Last rows
| | VERSIE | DATUM_BESTAND | PEILDATUM | JAAR | BEHANDELEND_SPECIALISME_CD | TYPERENDE_DIAGNOSE_CD | ZORGPRODUCT_CD | AANTAL_PAT_PER_ZPD | AANTAL_SUBTRAJECT_PER_ZPD | AANTAL_PAT_PER_DIAG | AANTAL_SUBTRAJECT_PER_DIAG | AANTAL_PAT_PER_SPC | AANTAL_SUBTRAJECT_PER_SPC | GEMIDDELDE_VERKOOPPRIJS |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 244726 | 1.0 | 2020-06-16 | 2020-06-01 | 2018-01-01 | 316 | 3520 | 991630051 | 184 | 189 | 3334 | 5013 | 440630 | 747381 | 1275.0 |
| 244727 | 1.0 | 2020-06-16 | 2020-06-01 | 2018-01-01 | 316 | 3520 | 991630046 | 1 | 1 | 3334 | 5013 | 440630 | 747381 | NaN |
| 244728 | 1.0 | 2020-06-16 | 2020-06-01 | 2018-01-01 | 316 | 3520 | 991630045 | 56 | 59 | 3334 | 5013 | 440630 | 747381 | 2740.0 |
| 244729 | 1.0 | 2020-06-16 | 2020-06-01 | 2018-01-01 | 316 | 3520 | 991630047 | 28 | 33 | 3334 | 5013 | 440630 | 747381 | NaN |
| 244730 | 1.0 | 2020-06-16 | 2020-06-01 | 2018-01-01 | 316 | 3520 | 991630070 | 14 | 15 | 3334 | 5013 | 440630 | 747381 | 2660.0 |
| 244731 | 1.0 | 2020-06-16 | 2020-06-01 | 2018-01-01 | 316 | 3520 | 991630044 | 7 | 7 | 3334 | 5013 | 440630 | 747381 | NaN |
| 244732 | 1.0 | 2020-06-16 | 2020-06-01 | 2018-01-01 | 316 | 3520 | 991630069 | 1 | 1 | 3334 | 5013 | 440630 | 747381 | NaN |
| 244733 | 1.0 | 2020-06-16 | 2020-06-01 | 2018-01-01 | 316 | 3520 | 991630048 | 1 | 1 | 3334 | 5013 | 440630 | 747381 | NaN |
| 244734 | 1.0 | 2020-06-16 | 2020-06-01 | 2018-01-01 | 316 | 3520 | 991630053 | 2919 | 4029 | 3334 | 5013 | 440630 | 747381 | 260.0 |
| 244735 | 1.0 | 2020-06-16 | 2020-06-01 | 2018-01-01 | 316 | 3520 | 991630052 | 600 | 675 | 3334 | 5013 | 440630 | 747381 | 880.0 |